# `cube_dbt` package

`cube_dbt` package simplifies defining the data model in the semantic layer
on top of [dbt][link-dbt-docs] models. It provides convenient tools for
loading the metadata of a dbt project, inspecting dbt models, and rendering
them as [cubes][ref-cubes] in [YAML][ref-model-syntax].

* Install [`cube_dbt`][link-cube-dbt-pypi] package from PyPI
* Check the source code in [`cube_dbt`][link-cube-dbt-repo] on GitHub
* Submit issues to [`cube`][link-cube-repo-issues] on GitHub

## Installation

{/*

### Cube Core

Run the following command in the root directory of your Cube project:

```bash
echo "cube_dbt" > requirements.txt
pip install -r requirements.txt
```

*/}

{/* TODO: test and uncomment Cube Core installation instructions */}

### Cube Cloud

Add the `cube_dbt` package to the `requirements.txt` file in the root
directory of your Cube project. Cube Cloud will install the dependencies
automatically.

## Reference

### `Dbt` class

Encapsulates tools for working with the metadata of a dbt project.

#### `Dbt.__init__`

The constructor accepts the metadata of a dbt project as a `dict` with the
contents of a [`manifest.json` file][link-dbt-manifest].

```python
import json
from cube_dbt import Dbt

manifest_path = './manifest.json'

with open(manifest_path, 'r') as file:
  manifest = json.loads(file.read())
  dbt = Dbt(manifest)
```

Use in cases when `Dbt.from_file` and `Dbt.from_url` aren't applicable,
e.g., when `manifest.json` is loaded from a private AWS S3 bucket.

#### `Dbt.from_file`

This static method loads the metadata of a dbt project from a `manifest.json`
file by its path and returns an instance of the `Dbt` class.

```python
from cube_dbt import Dbt

manifest_path = './manifest.json'

dbt = Dbt.from_file(manifest_path)
```

#### `Dbt.from_url`

This static method loads the metadata of a dbt project from a `manifest.json`
file by its URL and returns an instance of the `Dbt` class.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
```

#### `Dbt.filter`

This method filters loaded dbt models by their path prefixes, tags, or names.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url).filter(
  paths=['marts/'],  # Only models under the 'marts/' path
  tags=['cube'],     # Only models with the 'cube' tag
  names=['orders']   # Only the 'orders' model 
)
```

Use to expose only necessary dbt models to the semantic layer.

Note that values in `paths` should not be prefixed with `models/`.

#### `Dbt.models`

This property exposes a list of loaded dbt models as instances of the
`Model` class.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)

for model in dbt.models:
  print(model)
```

Only dbt models that comply with `Dbt.filter` rules and are not
materialized as [ephemeral][link-dbt-materializations] will be returned.

#### `Dbt.model`

This method returns a loaded dbt model by its name as an instance of the
`Model` class.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)

model = dbt.model('orders')
print(model)
```

Only dbt models that comply with `Dbt.filter` rules and are not
materialized as [ephemeral][link-dbt-materializations] will be returned.

### `Model` class

Encapsulates tools for working with the metadata of a dbt model.

#### `Model.name`

This property exposes the name of a dbt model.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.name)
# For example, 'orders'
```

#### `Model.description`

This property exposes the description of a dbt model.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.description)
# For example, 'All Jaffle Shop orders'
```

#### `Model.sql_table`

This property exposes the fully-qualified SQL relation name of a dbt model
that can be used as the [`sql_table` parameter][ref-cube-sql-table] of a cube.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.sql_table)
# For example, '"db"."public"."orders"'
```

#### `Model.columns`

This property exposes a list of columns that belong to this dbt model as
instances of the `Column` class.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

for column in model.columns:
  print(column)
```

#### `Model.column`

This method exposes a column that belongs to this dbt model by its name as
an instance of the `Column` class.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

column = model.column('status')
print(column)
```

#### `Model.primary_key`

This method returns the primary key column, if this dbt model has any, as an
instance of the `Column` class. Returns `None` if there's no primary key in
this dbt model.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.primary_key)
```

See [`Column.primary_key`][self-column-pk] for details on the detection of
primary key columns.

#### `Model.as_cube`

This method renders this dbt model as a YAML snippet that can be inserted
into YAML data models. Includes `name`, `description` (if present), and
`sql_table`.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.as_cube())
```

In the returned multiline string, all lines except for the first one are
left-padded with 4 spaces for easier use in YAML data models:

```yaml
# Jinja template
cubes:
  - {{ model.as_cube() }}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'
```

#### `Model.as_dimensions`

This method renders the list of columns that belong to this dbt model as
a YAML snippet that can be inserted into YAML data models.

Optionally, accepts a list of column names that should be ignored in `skip`.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')

print(model.as_dimensions(skip=['status']))
```

See [`Column.as_dimension`][self-column-as-dimension] for details on the
dimension rendering.

In the returned multiline string, all lines except for the first one are
left-padded with 6 spaces for easier use in YAML data models:

```yaml
# Jinja template
cubes:
  - {{ model.as_cube() }}

    dimensions:
      {{ model.as_dimensions() }}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'

    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true
```

### `Column` class

Encapsulates tools for working with the metadata of a column that belongs
to a dbt model.

#### `Column.name`

This property exposes the name of a column.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.name)
# For example, 'status'
```

#### `Column.description`

This property exposes the description of a column.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.description)
# For example, 'Order execution status: new, in progress, delivered'
```

#### `Column.sql`

This property exposes the name of a column that can be used as the
[`sql` parameter][ref-dimension-sql] of a dimension.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.sql)
# For example, 'status'
```

#### `Column.type`

This property exposes the data type of a column that can be used as the
[`type` parameter][ref-dimension-type] of a dimension.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.type)
# For example, 'string'
```

Column types should match [dimension types][ref-dimension-types] or a few
supported aliases:

| Column types                            | Dimension type |
| --------------------------------------- | -------------- |
| `time`, `date`, `datetime`, `timestamp` | `time`         |
| `string`                                | `string`       |
| `number`, `numeric`                     | `number`       |
| `boolean`, `bool`                       | `boolean`      |
| `geo`, `geography`                      | `geo`          |

If a column type is not defined in the metadata of a dbt project, `string`
is used by default.

#### `Column.meta`

This property exposes the meta data of a column as a `dict` that can be
used as the [`meta` parameter][ref-dimension-meta] of a dimension.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.meta)
# For example, '{some: "data"}'
```

#### `Column.primary_key`

This property exposes a `bool` value that indicates if a column is
a primary key or not.

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.primary_key)
# For example, 'False'
```

By convention, the column is considered a primary key if it has the
`primary_key` tag in the metadata of a dbt project.

#### `Column.as_dimension`

This method renders this column as a YAML snippet that can be inserted
into YAML data models. Includes `name`, `description` (if present), `sql`,
`type`, `primary_key` (if `True`), and `meta` (if present).

```python
from cube_dbt import Dbt

manifest_url = 'https://bucket.s3.amazonaws.com/manifest.json'

dbt = Dbt.from_url(manifest_url)
model = dbt.model('orders')
column = model.column('status')

print(column.as_dimension())
```

In the returned multiline string, all lines except for the first one are
left-padded with 8 spaces for easier use in YAML data models:

```yaml
# Jinja template
cubes:
  - {{ model.as_cube() }}

    dimensions:
      {% for column in model.columns() %}
      - {{ column.as_dimension() }}
      {% endfor %}

# YAML
cubes:
  - name: orders
    description: All Jaffle Shop orders
    sql_table: '"db"."public"."orders"'

    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true

      - name: status
        description: 'Order execution status: new, in progress, delivered'
        sql: status
        type: string
        meta:
          some: data
```


[link-dbt-docs]: https://docs.getdbt.com/docs/build/projects
[link-cube-dbt-repo]: https://github.com/cube-js/cube_dbt
[link-cube-dbt-pypi]: https://pypi.org/project/cube_dbt/
[link-cube-repo-issues]: https://github.com/cube-js/cube/issues
[link-dbt-manifest]: https://docs.getdbt.com/reference/artifacts/manifest-json
[link-dbt-materializations]: https://docs.getdbt.com/docs/build/materializations

[ref-cubes]: /reference/data-model/cube
[ref-model-syntax]: /product/data-modeling/syntax#model-syntax
[ref-cube-sql-table]: /reference/data-model/cube#sql_table
[ref-dimension-sql]: /reference/data-model/dimensions#sql
[ref-dimension-type]: /reference/data-model/dimensions#type
[ref-dimension-meta]: /reference/data-model/dimensions#meta
[ref-dimension-types]: /reference/data-model/types-and-formats#dimension-types

[self-column-pk]: #columnprimary_key
[self-column-as-dimension]: #columnas_dimension